Introduction to Automata Theory


Automata theory: the study of abstract computing devices, or "machines".


Before computers existed (in the 1930s), A. Turing studied an abstract machine
(the Turing machine) that had all the capabilities of today's computers, as far
as what they could compute is concerned. His goal was to describe precisely the
boundary between what a computing machine could do and what it could not do.

- Simpler kinds of machines (finite automata) were studied by a number of
  researchers and proved useful for a variety of purposes.

- Theoretical developments bear directly on what computer scientists do today:
  - Finite automata, formal grammars: design and construction of software
  - Turing machines: help us understand what we can expect from software
  - Theory of intractable problems: are we likely to be able to write a program
    to solve a given problem? Or should we try an approximation, a heuristic...
Why Study Automata Theory?


Finite automata are a useful model for many important kinds of software and hardware: 


1. Software for designing and checking the behaviour of digital circuits 


2. The lexical analyser of a typical compiler, that is, the compiler component that 
breaks the input text into logical units 


3. Software for scanning large bodies of text, such as collections of Web pages, to 
find occurrences of words, phrases or other patterns 


4. Software for verifying systems of all types that have a finite number of distinct
states, such as communications protocols or protocols for secure exchange
of information


Automata Theory, Languages and Computation - Mírian Halfeld-Ferrari — p. 3/1 


The Central Concepts of Automata Theory 
Alphabet

- A finite, nonempty set of symbols.
- Symbol: Σ

- Examples:
  - The binary alphabet: Σ = {0, 1}
  - The set of all lower-case letters: Σ = {a, b, ..., z}
  - The set of all ASCII characters
Strings

A string (or sometimes a word) is a finite sequence of symbols chosen from
some alphabet.

Example: 01101 and 111 are strings from the binary alphabet Σ = {0, 1}

- Empty string: the string with zero occurrences of symbols.
  This string is denoted by ε and may be chosen from any alphabet
  whatsoever.

- Length of a string: the number of positions for symbols in the string.
  Example: 01101 has length 5
  - There are only two symbols (0 and 1) in the string 01101, but 5 positions for
    symbols

- Notation for the length of w: |w|
  Example: |011| = 3 and |ε| = 0
Powers of an alphabet (1)

If Σ is an alphabet, we can express the set of all strings of a certain length from that
alphabet by using the exponential notation:

- Σ^k: the set of strings of length k, each of whose symbols is in Σ

- Examples:
  - Σ^0 = {ε}, regardless of what alphabet Σ is. That is, ε is the only string of
    length 0
  - If Σ = {0, 1}, then:
    1. Σ^1 = {0, 1}
    2. Σ^2 = {00, 01, 10, 11}
    3. Σ^3 = {000, 001, 010, 011, 100, 101, 110, 111}

Note: do not confuse Σ and Σ^1:
1. Σ is an alphabet; its members 0 and 1 are symbols
2. Σ^1 is a set of strings; its members are strings (each one of length 1)
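
The powers of an alphabet can be enumerated mechanically. The following is a minimal Python sketch, assuming symbols are single characters and strings are ordinary `str` values; the function name `power` is chosen here for illustration and does not come from the slides.

```python
from itertools import product

def power(alphabet, k):
    """Return Sigma^k: the set of all strings of length k over `alphabet`."""
    # product(..., repeat=k) yields every k-tuple of symbols;
    # for k = 0 it yields exactly one empty tuple, which joins to "".
    return {"".join(symbols) for symbols in product(sorted(alphabet), repeat=k)}

SIGMA = {"0", "1"}

print(power(SIGMA, 0))           # the set containing only the empty string
print(sorted(power(SIGMA, 2)))   # ['00', '01', '10', '11']
print(len(power(SIGMA, 3)))      # 8
```

In Python a symbol and a string of length 1 are both `str` values, so the distinction between the alphabet and its first power is not visible in the types; it remains a conceptual one.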
Kleene star

- Σ*: the set of all strings over an alphabet Σ
- {0, 1}* = {ε, 0, 1, 00, 01, 10, 11, 000, ...}
- Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ ...
- The symbol * is called the Kleene star and is named after the mathematician and
  logician Stephen Cole Kleene.
- Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ ...
  Thus: Σ* = Σ+ ∪ {ε}
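
Since the set of all strings over an alphabet is infinite, it cannot be built as a finite set, but it can be enumerated lazily. The sketch below is one possible way to do so in Python, listing strings in order of increasing length; the names `kleene_star` and `kleene_plus` are illustrative assumptions.

```python
from itertools import count, islice, product

def kleene_star(alphabet):
    """Yield every string over `alphabet`, shortest first (length 0, 1, 2, ...)."""
    symbols = sorted(alphabet)
    for k in count(0):
        for combo in product(symbols, repeat=k):
            yield "".join(combo)

def kleene_plus(alphabet):
    """Yield every nonempty string: the same enumeration with the empty string skipped."""
    return islice(kleene_star(alphabet), 1, None)

print(list(islice(kleene_star({"0", "1"}), 8)))
# ['', '0', '1', '00', '01', '10', '11', '000']
```

The generator never terminates on its own, so callers take a finite prefix with `islice`.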
Concatenation

Define the binary operation . called concatenation on Σ* as follows:
If a1a2a3...an and b1b2...bm are in Σ*, then

a1a2a3...an.b1b2...bm = a1a2a3...anb1b2...bm

- Thus, strings can be concatenated, yielding another string:
  if x and y are strings, then x.y denotes the concatenation of x and y, that is, the
  string formed by making a copy of x and following it by a copy of y

- Examples:
  1. x = 01101 and y = 110
     Then xy = 01101110 and yx = 11001101
  2. For any string w, the equations εw = wε = w hold.
     That is, ε is the identity for concatenation (when concatenated with any
     string, it yields the other string as a result)

- If S and T are subsets of Σ*, then
  S.T = {st | s ∈ S, t ∈ T}
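
Concatenation maps directly onto the `+` operator on Python strings. The sketch below replays the examples above and lifts the operation to sets of strings; `concat_sets` is a hypothetical helper name introduced only for this illustration.

```python
def concat_sets(S, T):
    """Return S.T, the set of all strings s + t with s in S and t in T."""
    return {s + t for s in S for t in T}

EPSILON = ""          # the empty string
x, y = "01101", "110"

print(x + y)                              # 01101110
print(y + x)                              # 11001101
print(EPSILON + x == x + EPSILON == x)    # True: the empty string is the identity

# Different pairs may produce the same string, so S.T can have fewer
# than len(S) * len(T) elements.
print(sorted(concat_sets({"0", "01"}, {"1", "11"})))   # ['01', '011', '0111']
```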
Languages

- If Σ is an alphabet, and L ⊆ Σ*, then L is a (formal) language over Σ.

- Language: a (possibly infinite) set of strings, all of which are chosen from some
  Σ*

- A language over Σ need not include strings with all symbols of Σ.
  Thus, a language over Σ is also a language over any alphabet that is a superset
  of Σ

- Examples:
  - Programming language C
    Legal programs are a subset of the possible strings that can be formed
    from the alphabet of the language (a subset of ASCII characters)
  - English or French



Other language examples

1. The language of all strings consisting of n 0s followed by n 1s (n ≥ 0):
   {ε, 01, 0011, 000111, ...}

2. The set of strings of 0s and 1s with an equal number of each:
   {ε, 01, 10, 0011, 0101, 1001, ...}

3. Σ* is a language for any alphabet Σ

4. ∅, the empty language, is a language over any alphabet

5. {ε}, the language consisting of only the empty string, is also a language over any
   alphabet
   NOTE: ∅ ≠ {ε} since ∅ has no strings and {ε} has one

6. {w | w consists of an equal number of 0 and 1}

7. {0^n 1^n | n ≥ 1}

8. {0^i 1^j | 0 ≤ i ≤ j}
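
An infinite language cannot be stored as a set, but it can be represented by a membership test. The following Python sketch gives such tests for three of the languages listed above, under the assumption that strings are `str` values over the symbols 0 and 1; all function names are illustrative.

```python
def in_0n1n(w, min_n=0):
    """Is w made of n 0s followed by n 1s, for some n >= min_n?"""
    n = len(w) // 2
    return len(w) % 2 == 0 and n >= min_n and w == "0" * n + "1" * n

def equal_zeros_ones(w):
    """Is w a string of 0s and 1s with an equal number of each?"""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")

def in_0i1j(w):
    """Is w made of i 0s followed by j 1s, with 0 <= i <= j?"""
    i = len(w) - len(w.lstrip("0"))   # length of the leading block of 0s
    j = len(w) - i
    return w == "0" * i + "1" * j and i <= j

print(in_0n1n("000111"))         # True
print(in_0n1n("", min_n=1))      # False: the empty string needs n = 0
print(equal_zeros_ones("1001"))  # True
print(in_0i1j("011"))            # True
print(in_0i1j("001"))            # False: more 0s than 1s
```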
Important operators on languages: Union


The union of two languages L and M, denoted L ∪ M, is the set of strings that are in
either L, or M, or both.


Example


If L = {001, 10, 111} and M = {ε, 001} then
L ∪ M = {ε, 001, 10, 111}
Important operators on languages: Concatenation


The concatenation of languages L and M, denoted L.M or just LM, is the set of
strings that can be formed by taking any string in L and concatenating it with any string
in M.


Example

If L = {001, 10, 111} and M = {ε, 001} then
L.M = {001, 10, 111, 001001, 10001, 111001}
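
For finite languages, both union and concatenation can be checked directly with Python sets. This short sketch recomputes the two examples above, writing the empty string as `""`; the variable names are chosen here for illustration.

```python
EPSILON = ""
L = {"001", "10", "111"}
M = {EPSILON, "001"}

union = L | M                                    # strings in L, or M, or both
concatenation = {s + t for s in L for t in M}    # every s in L followed by every t in M

print(sorted(union, key=lambda w: (len(w), w)))
print(sorted(concatenation, key=lambda w: (len(w), w)))
```

Because M contains the empty string, every string of L reappears unchanged in the concatenation.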
Important operators on languages: Closure


The closure of a language L is denoted L* and represents the set of those strings that
can be formed by taking any number of strings from L, possibly with repetitions (i.e., the
same string may be selected more than once) and concatenating all of them.


Examples:

- If L = {0, 1} then L* is all strings of 0s and 1s

- If L = {0, 11} then L* consists of strings of 0s and 1s such that the 1s come in
  pairs, e.g., 011, 11110 and ε. But not 01011 or 101.


Formally, L* is the infinite union of L^i over all i ≥ 0, where L^0 = {ε}, L^1 = L, and
for i > 1 we have L^i = LL...L (the concatenation of i copies of L).
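
The closure is usually infinite, so a program can only compute a finite part of it. The sketch below, an illustration rather than a standard library routine, collects every string of the closure up to a chosen length by repeatedly appending one more string from L; `closure_up_to` and `max_len` are assumed names.

```python
def closure_up_to(L, max_len):
    """Return the strings of the closure of L whose length is at most max_len."""
    result = {""}        # the zeroth power of L contains only the empty string
    frontier = {""}      # strings found in the previous round
    while frontier:
        # Append one more string of L to each newly found string,
        # keeping only results that are short enough and not seen before.
        frontier = {
            w + s
            for w in frontier
            for s in L
            if len(w + s) <= max_len
        } - result
        result |= frontier
    return result

star = closure_up_to({"0", "11"}, 5)
print("011" in star, "11110" in star, "" in star)   # True True True
print("01011" in star, "101" in star)               # False False
```

The loop stops because only finitely many strings have length at most `max_len`, and each one enters `result` at most once.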
Regular Expressions
Regular Expressions and Languages


We define the regular expressions recursively.


Basis: The basis consists of three parts:

1. The constants ε and ∅ are regular expressions, denoting the languages {ε} and ∅,
   respectively. That is, L(ε) = {ε} and L(∅) = ∅.

2. If a is a symbol, then a is a regular expression. This expression denotes the
   language {a}, i.e., L(a) = {a}.
   NOTE: We use boldface font to denote an expression corresponding to a symbol.

3. A variable, usually capitalised and italic such as L, represents any
   language.
Regular Expressions and Languages


Induction: There are four parts to the inductive step, one for each of the three operators
and one for the introduction of parentheses

1. If E and F are regular expressions, then E + F is a regular expression denoting
   the union of L(E) and L(F). That is, L(E + F) = L(E) ∪ L(F).

2. If E and F are regular expressions, then EF is a regular expression denoting the
   concatenation of L(E) and L(F). That is, L(EF) = L(E)L(F).

3. If E is a regular expression, then E* is a regular expression denoting the closure
   of L(E). That is, L(E*) = (L(E))*.

4. If E is a regular expression, then (E) is a regular expression denoting the same
   language as E. Formally, L((E)) = L(E).
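
The recursive definition translates almost line by line into a recursive function over an expression tree. The following Python sketch is one possible encoding, not the only one: the class names (`Empty`, `Eps`, `Sym`, `Union`, `Concat`, `Star`) and the `max_len` bound are assumptions made here so that the computed sets stay finite. Parentheses need no class of their own, since grouping is already expressed by the shape of the tree.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:            # the constant denoting the empty language
    pass

@dataclass(frozen=True)
class Eps:              # the constant denoting the language {empty string}
    pass

@dataclass(frozen=True)
class Sym:              # a single symbol a, denoting {a}
    a: str

@dataclass(frozen=True)
class Union:            # E + F
    e: object
    f: object

@dataclass(frozen=True)
class Concat:           # EF
    e: object
    f: object

@dataclass(frozen=True)
class Star:             # E*
    e: object

def language(expr, max_len):
    """Return the strings of L(expr) whose length is at most max_len."""
    if isinstance(expr, Empty):
        return set()
    if isinstance(expr, Eps):
        return {""}
    if isinstance(expr, Sym):
        return {expr.a} if len(expr.a) <= max_len else set()
    if isinstance(expr, Union):
        return language(expr.e, max_len) | language(expr.f, max_len)
    if isinstance(expr, Concat):
        left, right = language(expr.e, max_len), language(expr.f, max_len)
        return {s + t for s in left for t in right if len(s + t) <= max_len}
    if isinstance(expr, Star):
        base = language(expr.e, max_len)
        result, frontier = {""}, {""}
        while frontier:   # keep appending strings of `base` until nothing new fits
            frontier = {w + s for w in frontier for s in base
                        if len(w + s) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {expr!r}")

# (0 + 11)*: strings of 0s and 1s in which the 1s come in pairs
pairs = Star(Union(Sym("0"), Concat(Sym("1"), Sym("1"))))
print(sorted(language(pairs, 3), key=lambda w: (len(w), w)))
# ['', '0', '00', '11', '000', '011', '110']
```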
The use of regular expressions: examples of applications


- Definition of lexical analysers (compilers).

- Used in operating systems like UNIX (in Unix-style notation):
  [A-Z][a-z]*[ ][A-Z][A-Z]

  represents capitalised words followed by a space and two capital letters. This
  expression represents patterns in text that could be a city and a state, e.g.,
  Ithaca NY.

  It misses multi-word city names such as Palo Alto CA
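
The city-and-state pattern can be tried out with Python's `re` module, whose syntax agrees with the Unix-style notation for this particular expression. In the sketch below, `CITY_STATE` is the pattern from the text, while `MULTI_WORD` is a hypothetical extension, not taken from the slides, showing one way to accept city names made of several capitalised words.

```python
import re

# The pattern from the text: a capitalised word, a space, two capital letters.
CITY_STATE = re.compile(r"[A-Z][a-z]*[ ][A-Z][A-Z]")

# Hypothetical extension (an assumption, not part of the text):
# allow any number of additional capitalised words before the state code.
MULTI_WORD = re.compile(r"[A-Z][a-z]*([ ][A-Z][a-z]*)*[ ][A-Z][A-Z]")

print(bool(CITY_STATE.fullmatch("Ithaca NY")))      # True
print(bool(CITY_STATE.fullmatch("Palo Alto CA")))   # False
print(bool(MULTI_WORD.fullmatch("Palo Alto CA")))   # True
```

`fullmatch` is used so that the whole input must fit the pattern; `search` would instead find the substring "Alto CA" inside "Palo Alto CA".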